23 research outputs found

    Approximating Subadditive Hadamard Functions on Implicit Matrices

    An important challenge in the streaming model is to maintain small-space approximations of entrywise functions performed on a matrix that is generated by the outer product of two vectors given as a stream. In other works, streams typically define matrices in a standard way via a sequence of updates, as in the work of Woodruff (2014) and others. We describe the matrix formed by the outer product, and other matrices that do not fall into this category, as implicit matrices. As such, we consider the general problem of computing over such implicit matrices with Hadamard functions, which are functions applied entrywise on a matrix. In this paper, we apply this generalization to provide new techniques for identifying independence between two vectors in the streaming model. The previous state of the art algorithm of Braverman and Ostrovsky (2010) gave a $(1 \pm \epsilon)$-approximation for the $L_1$ distance between the product and joint distributions, using space $O(\log^{1024}(nm) \epsilon^{-1024})$, where $m$ is the length of the stream and $n$ denotes the size of the universe from which stream elements are drawn. Our general techniques include the $L_1$ distance as a special case, and we give an improved space bound of $O(\log^{12}(n) \log^{2}({nm \over \epsilon})\epsilon^{-7})$.
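
    As a point of reference for the quantity being approximated, the following is a minimal sketch that computes the L1 distance between the joint distribution of a stream of pairs and the product of its marginals exactly. It is not the paper's algorithm: it stores full frequency tables, so it uses space linear in the number of distinct items, not polylogarithmic space. The function name and the input format (a list of pairs) are assumptions made for illustration.

```python
from collections import Counter

def l1_independence_distance(pairs):
    """Exact L1 distance between the empirical joint distribution of
    (a, b) pairs and the product of its two marginals.

    Illustrative only: this keeps full frequency tables, unlike a
    small-space streaming sketch.
    """
    pairs = list(pairs)
    m = len(pairs)
    if m == 0:
        raise ValueError("stream must be non-empty")
    joint = Counter(pairs)                # frequency of each pair
    left = Counter(a for a, _ in pairs)   # marginal of first coordinate
    right = Counter(b for _, b in pairs)  # marginal of second coordinate
    total = 0.0
    # The product distribution is the outer product of the two marginals;
    # the absolute difference is applied entrywise and summed.
    for a, ca in left.items():
        for b, cb in right.items():
            total += abs(joint.get((a, b), 0) / m - (ca / m) * (cb / m))
    return total
```

    A stream whose coordinates are independent, such as [(0, 0), (0, 1), (1, 0), (1, 1)], gives distance 0, while the fully correlated stream [(0, 0), (1, 1)] gives distance 1.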

    Makespan Minimization via Posted Prices

    We consider job scheduling settings, with multiple machines, where jobs arrive online and choose a machine selfishly so as to minimize their cost. Our objective is the classic makespan minimization objective, which corresponds to the completion time of the last job to complete. The incentives of the selfish jobs may lead to poor performance. To reconcile the differing objectives, we introduce posted machine prices. The selfish job seeks to minimize the sum of its completion time on the machine and the posted price for the machine. Prices may be static (i.e., set once and for all before any arrival) or dynamic (i.e., change over time), but they are determined only by the past, assuming nothing about upcoming events. Obviously, such schemes are inherently truthful. We consider the competitive ratio: the ratio between the makespan achievable by the pricing scheme and that of the optimal algorithm. We give tight bounds on the competitive ratio for both dynamic and static pricing schemes for identical, restricted, related, and unrelated machine settings. Our main result is a dynamic pricing scheme for related machines that gives a constant competitive ratio, essentially matching the competitive ratio of online algorithms for this setting. In contrast, dynamic pricing gives poor performance for unrelated machines. This lower bound also exhibits a gap between what can be achieved by pricing versus what can be achieved by online algorithms
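
    The selfish choice rule described above can be sketched as follows. This is a toy model under stated assumptions, not the paper's pricing scheme: machines are identical, prices are static, jobs on a machine run in arrival order (so a job's completion time is the machine's current load plus its own size), and ties go to the lowest-indexed machine. The function name `assign_jobs` is hypothetical.

```python
def assign_jobs(job_sizes, prices):
    """Simulate selfish jobs arriving online on identical machines.

    Each job picks the machine minimizing
    (its completion time on that machine) + (the machine's posted price).
    Returns the list of chosen machine indices and the resulting makespan.
    """
    loads = [0.0] * len(prices)
    choices = []
    for size in job_sizes:
        # Completion time = current load + own processing time (assumed FIFO).
        best = min(range(len(prices)),
                   key=lambda i: loads[i] + size + prices[i])
        loads[best] += size
        choices.append(best)
    return choices, max(loads)
```

    For example, `assign_jobs([2, 1, 1], [0, 0.5])` sends the first job to machine 0 and the next two to machine 1, for a makespan of 2.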

    The Bane of Low-Dimensionality Clustering

    In this paper, we give a conditional lower bound of $n^{\Omega(k)}$ on running time for the classic k-median and k-means clustering objectives (where n is the size of the input), even in low-dimensional Euclidean space of dimension four, assuming the Exponential Time Hypothesis (ETH). We also consider k-median (and k-means) with penalties where each point need not be assigned to a center, in which case it must pay a penalty, and extend our lower bound to at least three-dimensional Euclidean space. This stands in stark contrast to many other geometric problems such as the traveling salesman problem, or computing an independent set of unit spheres. While these problems benefit from the so-called (limited) blessing of dimensionality, as they can be solved in time $n^{O(k^{1-1/d})}$ or $2^{n^{1-1/d}}$ in d dimensions, our work shows that widely-used clustering objectives have a lower bound of $n^{\Omega(k)}$, even in dimension four. We complete the picture by considering the two-dimensional case: we show that there is no algorithm that solves the penalized version in time less than $n^{o(\sqrt{k})}$, and provide a matching upper bound of $n^{O(\sqrt{k})}$. The main tool we use to establish these lower bounds is the placement of points on the moment curve, which takes its inspiration from constructions of point sets yielding Delaunay complexes of high complexity.
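
    To make the penalized objective concrete, here is a minimal sketch that evaluates the cost of a given set of centers when every point either pays its distance to the nearest center or pays a penalty. A single uniform penalty for all points is an assumption of this sketch, as are the function name and the `power` parameter; it only evaluates a candidate solution and says nothing about how to find one.

```python
import math

def penalized_cost(points, centers, penalty, power=1):
    """Cost of a clustering with penalties.

    Each point pays min(distance to nearest center ** power, penalty).
    power=1 gives a k-median style objective, power=2 a k-means style one.
    Assumes a uniform penalty and a non-empty list of centers.
    """
    total = 0.0
    for p in points:
        nearest = min(math.dist(p, c) for c in centers)
        total += min(nearest ** power, penalty)
    return total
```

    With points (0, 0), (1, 0), (10, 0), a single center at (0, 0), and penalty 3, the far point is cheaper to leave unassigned, so the total cost is 0 + 1 + 3 = 4.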

    Zero-One Laws for Sliding Windows and Universal Sketches

    Given a stream of data, a typical approach in streaming algorithms is to design a sophisticated algorithm with small memory that computes a specific statistic over the streaming data. Usually, if one wants to compute a different statistic after the stream is gone, it is impossible. But what if we want to compute a different statistic after the fact? In this paper, we consider the following fascinating possibility: can we collect some small amount of specific data during the stream that is "universal," i.e., where we do not know anything about the statistics we will want to later compute, other than the guarantee that had we known the statistic ahead of time, it would have been possible to do so with small memory? This is indeed what we introduce (and show) in this paper with matching upper and lower bounds: we show that it is possible to collect universal statistics of polylogarithmic size, and prove that these universal statistics allow us after the fact to compute all other statistics that are computable with similar amounts of memory. We show that this is indeed possible, both for the standard unbounded streaming model and the sliding window streaming model

    Fast Fencing

    We consider very natural "fence enclosure" problems studied by Capoyleas, Rote, and Woeginger and Arkin, Khuller, and Mitchell in the early 90s. Given a set $S$ of $n$ points in the plane, we aim at finding a set of closed curves such that (1) each point is enclosed by a curve and (2) the total length of the curves is minimized. We consider two main variants. In the first variant, we pay a unit cost per curve in addition to the total length of the curves. An equivalent formulation of this version is that we have to enclose $n$ unit disks, paying only the total length of the enclosing curves. In the other variant, we are allowed to use at most $k$ closed curves and pay no cost per curve. For the variant with at most $k$ closed curves, we present an algorithm that is polynomial in both $n$ and $k$. For the variant with unit cost per curve, or unit disks, we present a near-linear time algorithm. Capoyleas, Rote, and Woeginger solved the problem with at most $k$ curves in $n^{O(k)}$ time. Arkin, Khuller, and Mitchell used this to solve the unit cost per curve version in exponential time. At the time, they conjectured that the problem with $k$ curves is NP-hard for general $k$. Our polynomial time algorithm refutes this unless P equals NP.

    Online optimization with switching cost

    We consider algorithms for "smoothed online convex optimization (SOCO)" problems. SOCO is a variant of the class of "online convex optimization (OCO)" problems that is strongly related to the class of "metrical task systems", each of which has been studied extensively. Prior literature on these problems has focused on two performance metrics: regret and competitive ratio. There exist known algorithms with sublinear regret and known algorithms with constant competitive ratios; however, no known algorithms achieve both. In this paper, we show that this is due to a fundamental incompatibility between regret and the competitive ratio -- no algorithm (deterministic or randomized) can achieve sublinear regret and a constant competitive ratio, even in the case when the objective functions are linear.

    A Tale of Two Metrics: Simultaneous Bounds on Competitiveness and Regret

    We consider algorithms for “smoothed online convex optimization” (SOCO) problems, which are a hybrid between online convex optimization (OCO) and metrical task system (MTS) problems. Historically, the performance metric for OCO was regret and that for MTS was competitive ratio (CR). There are algorithms with either sublinear regret or constant CR, but no known algorithm achieves both simultaneously. We show that this is a fundamental limitation – no algorithm (deterministic or randomized) can achieve sublinear regret and a constant CR, even when the objective functions are linear and the decision space is one dimensional. However, we present an algorithm that, for the important one dimensional case, provides sublinear regret and a CR that grows arbitrarily slowly
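
    The two benchmarks behind these metrics can be illustrated with a small sketch for the one-dimensional case. Regret compares an algorithm's cost with the best single fixed decision, while the competitive ratio compares it with the best moving sequence. The sketch assumes the switching cost is the absolute distance between consecutive decisions, the starting point is 0, and decisions are restricted to a finite grid; these conventions and all function names are assumptions for illustration, not taken from the paper.

```python
def total_cost(xs, costs, x0=0.0):
    """Hitting cost plus switching cost |x_t - x_{t-1}| of a decision sequence."""
    total, prev = 0.0, x0
    for x, c in zip(xs, costs):
        total += c(x) + abs(x - prev)
        prev = x
    return total

def static_opt(costs, grid, x0=0.0):
    """Cost of the best single grid point held for all rounds (regret benchmark)."""
    return min(total_cost([x] * len(costs), costs, x0) for x in grid)

def dynamic_opt(costs, grid, x0=0.0):
    """Cost of the best moving sequence on the grid (competitive-ratio benchmark).

    Dynamic program: best[x] is the cheapest cost of any sequence that
    ends at grid point x after the rounds processed so far.
    Assumes at least one round.
    """
    best = {x: abs(x - x0) + costs[0](x) for x in grid}
    for c in costs[1:]:
        best = {x: c(x) + min(best[y] + abs(x - y) for y in grid)
                for x in grid}
    return min(best.values())
```

    With linear costs x, 1 - x, 1 - x on the grid {0, 1}, the static benchmark costs 2 and the dynamic benchmark costs 1, so an algorithm playing 0, 0, 1 (total cost 2) has zero regret in this instance but is a factor of 2 from the dynamic optimum.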